data worker
- North America > United States > Arizona (0.04)
- Europe > France (0.04)
- North America > United States > Virginia (0.04)
- (9 more...)
- Questionnaire & Opinion Survey (1.00)
- Personal > Interview (1.00)
- Research Report > New Finding (0.67)
- Overview (0.67)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- (2 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- (4 more...)
Building Benchmarks from the Ground Up: Community-Centered Evaluation of LLMs in Healthcare Chatbot Settings
Hamna, Bhat, Gayatri, Mukherjee, Sourabrata, Lalani, Faisal, Hadfield, Evan, Siddarth, Divya, Bali, Kalika, Sitaram, Sunayana
Large Language Models (LLMs) are typically evaluated through general or domain-specific benchmarks testing capabilities that often lack grounding in the lived realities of end users. Critical domains such as healthcare require evaluations that extend beyond artificial or simulated tasks to reflect the everyday needs, cultural practices, and nuanced contexts of communities. We propose Samiksha, a community-driven evaluation pipeline co-created with civil-society organizations (CSOs) and community members. Our approach enables scalable, automated benchmarking through a culturally aware, community-driven pipeline in which community feedback informs what to evaluate, how the benchmark is built, and how outputs are scored. We demonstrate this approach in the health domain in India. Our analysis highlights how current multilingual LLMs address nuanced community health queries, while also offering a scalable pathway for contextually grounded and inclusive LLM evaluation.
- Asia > India > Karnataka > Bengaluru (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (8 more...)
- Research Report (1.00)
- Personal > Interview (0.93)
- Overview (0.93)
Secondary Stakeholders in AI: Fighting for, Brokering, and Navigating Agency
Ajmani, Leah Hope, Abdelkadir, Nuredin Ali, Chancellor, Stevie
As AI technologies become more human-facing, there have been numerous calls to adapt participatory approaches to AI development -- spurring the idea of participatory AI. However, these calls often focus only on primary stakeholders, such as end-users, and not secondary stakeholders. This paper seeks to translate the ideals of participatory AI to a broader population of secondary AI stakeholders through semi-structured interviews. We theorize that meaningful participation involves three participatory ideals: (1) informedness, (2) consent, and (3) agency. We also explore how secondary stakeholders realize these ideals by traversing a complicated problem space. Like walking up the rungs of a ladder, these ideals build on one another. We introduce three stakeholder archetypes: the reluctant data contributor, the unsupported activist, and the well-intentioned practitioner, who must navigate systemic barriers to achieving agentic AI relationships. We envision an AI future where secondary stakeholders are able to meaningfully participate with the AI systems they influence and are influenced by.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.07)
- (8 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.93)
- Health & Medicine (0.94)
- Social Sector (0.93)
- Law (0.93)
- (2 more...)
FactFlow: Automatic Fact Sheet Generation and Customization from Tabular Dataset via AI Chain Design & Implementation
Vu, Minh Duc, Chen, Jieshan, Xing, Zhenchang, Lu, Qinghua, Xu, Xiwei, Fu, Qian
With the proliferation of data across various domains, there is a critical demand for tools that enable non-experts to derive meaningful insights without deep data analysis skills. To address this need, existing automatic fact sheet generation tools offer heuristic-based solutions to extract facts and generate stories. However, they inadequately grasp the semantics of the data and struggle to generate narratives that fully capture the dataset's meaning or align the fact sheet with specific user needs. Addressing these shortcomings, this paper introduces FactFlow, a novel tool designed for the automatic generation and customisation of fact sheets. FactFlow applies the concept of collaborative AI workers to transform raw tabular datasets into comprehensive, visually compelling fact sheets. We define an effective taxonomy to profile AI workers for specialised tasks. Furthermore, FactFlow empowers users to refine these fact sheets through intuitive natural language commands, ensuring the final outputs align closely with individual preferences and requirements. Our user evaluation with 18 participants confirms that FactFlow not only surpasses state-of-the-art baselines in automated fact sheet production but also provides a positive user experience during customization tasks.
- Media > Film (0.46)
- Information Technology > Security & Privacy (0.46)
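The heuristic fact-extraction step that the abstract above contrasts with its AI-chain approach can be illustrated with a minimal sketch. The `extract_facts` helper and its rules (extreme values, a simple trend check) are hypothetical illustrations of heuristic-based fact sheet tools in general, not FactFlow's actual pipeline:

```python
# Minimal sketch of heuristic fact extraction from a tabular dataset.
# The rules below (extremes, monotonic trend) are illustrative only.

def extract_facts(rows, value_key, label_key):
    """Derive a few candidate 'facts' from a list of row dicts."""
    facts = []
    top = max(rows, key=lambda r: r[value_key])
    low = min(rows, key=lambda r: r[value_key])
    facts.append(f"{top[label_key]} has the highest {value_key} ({top[value_key]})")
    facts.append(f"{low[label_key]} has the lowest {value_key} ({low[value_key]})")
    values = [r[value_key] for r in rows]
    if all(a <= b for a, b in zip(values, values[1:])):
        facts.append(f"{value_key} increases monotonically across rows")
    return facts

sales = [
    {"quarter": "Q1", "revenue": 120},
    {"quarter": "Q2", "revenue": 150},
    {"quarter": "Q3", "revenue": 180},
]
for fact in extract_facts(sales, "revenue", "quarter"):
    print(fact)
```

Such rules are fast but, as the paper notes, blind to what the data means; they cannot tell whether the "highest revenue" fact is the story a reader actually needs.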
A Taxonomy of Challenges to Curating Fair Datasets
Zhao, Dora, Scheuerman, Morgan Klaus, Chitre, Pooja, Andrews, Jerone T. A., Panagiotidou, Georgia, Walker, Shawn, Pine, Kathleen H., Xiang, Alice
Despite extensive efforts to create fairer machine learning (ML) datasets, there remains a limited understanding of the practical aspects of dataset curation. Drawing from interviews with 30 ML dataset curators, we present a comprehensive taxonomy of the challenges and trade-offs encountered throughout the dataset curation lifecycle. Our findings underscore overarching issues within the broader fairness landscape that impact data curation. We conclude with recommendations aimed at fostering systemic changes to better facilitate fair dataset curation practices.
- North America > United States > Arizona (0.04)
- Europe > France (0.04)
- North America > United States > Illinois (0.04)
- (8 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Personal > Interview (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- (2 more...)
Automatic Histograms: Leveraging Language Models for Text Dataset Exploration
Reif, Emily, Qian, Crystal, Wexler, James, Kahng, Minsuk
Making sense of unstructured text datasets is perennially difficult, yet increasingly relevant with Large Language Models. Data workers often rely on dataset summaries, especially distributions of various derived features. Some features, like toxicity or topics, are relevant to many datasets, but many interesting features are domain specific: instruments and genres for a music dataset, or diseases and symptoms for a medical dataset. Accordingly, data workers often run custom analyses for each dataset, which is cumbersome and difficult. We present AutoHistograms, a visualization tool leveraging LLMs. AutoHistograms automatically identifies relevant features, visualizes them with histograms, and allows the user to interactively query the dataset for categories of entities and create new histograms. In a user study with data workers (n=10), we observe that participants can quickly identify insights and explore the data using AutoHistograms, and conceptualize a broad range of applicable use cases. Together, this tool and user study contribute to the growing field of LLM-assisted sensemaking tools.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Virginia (0.04)
- (8 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report (0.83)
We are all AI's free data workers
The secret to making AI chatbots sound smart and spew less toxic nonsense is to use a technique called reinforcement learning from human feedback, which uses input from people to improve the model's answers. It relies on a small army of human data annotators who evaluate whether a string of text makes sense and sounds fluent and natural. They decide whether a response should be kept in the AI model's database or removed. Even the most impressive AI chatbots require thousands of human work hours to behave in a way their creators want them to, and even then they do it unreliably. The work can be brutal and upsetting, as we will hear this week when the ACM Conference on Fairness, Accountability, and Transparency (FAccT) gets underway.
Why is AI not a Panacea for Data Workers? An Interview Study on Human-AI Collaboration in Data Storytelling
Li, Haotian, Wang, Yun, Liao, Q. Vera, Qu, Huamin
Data storytelling plays an important role in data workers' daily jobs since it boosts team collaboration and public communication. However, to make an appealing data story, data workers spend tremendous effort on various tasks, including outlining and styling the story. Recently, a growing research trend has been exploring how to assist data storytelling with advanced artificial intelligence (AI). However, existing studies tend to focus on individual tasks in the data storytelling workflow and do not reveal a complete picture of humans' preferences for collaborating with AI. To better understand real-world needs, we interviewed eighteen data workers from both industry and academia to learn where and how they would like to collaborate with AI. Surprisingly, though the participants showed excitement about collaborating with AI, many of them also expressed reluctance and pointed out nuanced reasons. Based on their responses, we first characterize the stages and tasks in practical data storytelling workflows and the desired roles of AI. We then identify the preferred collaboration patterns in different tasks. Next, we summarize the interviewees' reasons for and against collaborating with AI. Finally, we provide suggestions for human-AI collaborative data storytelling to hopefully shed light on future related research.
- Information Technology > Human Computer Interaction > Interfaces (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)
How to Break Data Silos to Drive Enterprise-Wide AI - Splice Machine
Not many people miss having to manually sort files, label papers, or search for lost forms in huge filing cabinets. That's because all these tasks have become way easier, faster, and more enjoyable since they've become digitized – computers and the internet have revolutionized the way businesses approach organization and task management. Similar to how computers and the internet made monotonous tasks faster and easier in every department, AI will transform work in every industry in the 21st century. Machine learning will automate away the most time-consuming and repetitive tasks across a company, along with offering predictions that will allow businesses to make better decisions ahead of time. Introducing these revolutionary processes takes time and specialized knowledge.